A Brief Discussion on Mainstream AI Service Ecosystems and Related Tools
Recently, while exploring n8n, I realized that the existing services for my Side Project (such as PostgreSQL) were too cumbersome for my needs, so I decided to rewrite the entire architecture.
Furthermore, the new architecture previously planned by AI had significant issues. Although we initially discussed a seemingly reasonable version, two days later, I overturned the n8n plan again and decided to wait until the weekend to properly plan it out.
I am using this cooling-off period to reflect on my actual usage scenarios for various AI tools.
Mainstream AI Service Ecosystems
Currently, the mainstream closed-source AI model providers on the market are primarily OpenAI, Anthropic, and Google. In terms of basic functionality (Q&A, multimodal input, web search, etc.), all three are quite mature.
Each service generally has three systems, with different billing methods and default data privacy settings:
- Developer System: Primarily based on API Keys, billed by usage, suitable for programmatic integration and automated workflows. By default, data from these APIs is not used to train models, with the exception of the Google AI Studio free tier (see the Google section below).
- Enterprise and Team System: Provides unified account management, usage quota control, and advanced SLA guarantees. It is usually based on contract pricing, and data contracts explicitly prohibit usage for training.
- Consumer System: Divided into free plans and subscription tiers, with subscriptions offering stronger models or higher usage limits. Web conversation data may be used for training by default, though each provider offers settings to disable this.
OpenAI Ecosystem
Having started the earliest, OpenAI has built a massive application ecosystem by connecting various external tools and the widely popular GPTs. It is usually the first thing that comes to mind when the general public encounters AI or considers a subscription.
OpenAI was the first AI service I encountered, but it is currently the only one of the three I have never actually subscribed to, so I will only provide an overview here. Subscription plans are divided into Go, Plus, and Pro tiers. API usage (Developer System) and web subscriptions (Consumer System) are billed separately.
Key tools:
- ChatGPT (Web/App) is the primary conversation interface, featuring a built-in Canvas that allows you to open a separate window next to the chat to edit documents or code directly.
- GPTs allow for the rapid creation of custom bots, supporting external API integration (Actions), making them suitable for embedding into automated workflows.
- GPT Image 1 (text-to-image) is built into the chat and available to free users.
- Sora (video generation) is currently limited to paid plans like Plus / Pro.
Anthropic Ecosystem
Rather than offering a "do-it-all" feature set, Anthropic focuses all its skill points on specific domains like "coding and long-form logical reasoning," without features like image or video generation. This has led to an interesting gap in its user base: it has high visibility in the developer community, with many already subscribed or integrated via API; however, in general companies (including software firms), fewer people subscribe (perhaps due to the high cost?), and some have never even heard of it, knowing only ChatGPT.
Claude has the smallest product line of the three. The developer system uses Anthropic Console to obtain API Keys, billed on a Pay-as-you-go basis. For individual users, the consumer system is divided into Pro / Max subscription tiers.
Note that Claude's token calculation and limitation mechanisms differ from others: the free tier has a very small quota, and if a single conversation becomes too long, you will be forced to "start a new chat" to continue. Although paid subscriptions do not have this hard length limit, because Claude includes the "entire conversation history" in the Context Window when replying, the longer the conversation, the more tokens are consumed per Q&A, and your daily quota will burn faster. Additionally, "Claude Code and web version sharing the same quota" is a common pitfall; at the end of last December, I was unable to use the web version because I had heavily used Claude Code for testing.
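To make the "longer conversation, faster quota burn" effect concrete, here is a minimal sketch. It assumes a simplified model in which every turn resends the full history and quota is proportional to tokens processed; the function name and the token counts are hypothetical, and the real quota accounting may differ.

```python
def tokens_processed_per_turn(turn_sizes):
    """Return the tokens processed at each turn when the full history is resent.

    turn_sizes: list of (user_tokens, reply_tokens) tuples, one per turn.
    Assumption: each request carries the entire prior conversation.
    """
    history = 0
    processed = []
    for user_tokens, reply_tokens in turn_sizes:
        prompt = history + user_tokens        # whole history plus the new question
        total = prompt + reply_tokens         # plus the generated reply
        processed.append(total)
        history = total                       # the reply becomes part of the history
    return processed


# Hypothetical conversation: five turns, each a 200-token question and an 800-token reply.
per_turn = tokens_processed_per_turn([(200, 800)] * 5)
print(per_turn)       # [1000, 2000, 3000, 4000, 5000]
print(sum(per_turn))  # 15000
```

Under these assumptions the fifth exchange costs five times as much as the first, even though the question and answer are the same size, which is why starting a fresh chat for a new topic stretches the quota.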
Overview of Common Claude Tools
Claude (Web/App): The main conversation interface.
Claude Code: A CLI AI agent tool built specifically for developers. It can run directly in the terminal or IDE, automatically read the local codebase, execute shell commands, fix bugs, create PRs, and supports MCP (Model Context Protocol) to connect to external tools. It currently offers two usage modes:
- Claude Code on Web: A browser interface that, after connecting to a GitHub account, clones the repository to a cloud VM managed by Anthropic for execution, suitable for scenarios where you don't want to install an environment locally.
- Claude Code on desktop: Integrated into the official Desktop App, providing a graphical interface, including visual Diff comparison, real-time App preview, and the ability to execute multiple local or remote sessions simultaneously.
Claude Cowork: An agent preview feature launched in the official Desktop App for paid users, allowing Claude to directly access local files and handle complex, multi-step tasks.
Artifacts: A real-time preview feature built into the Claude Web/App. When you ask it to write code, create a webpage, or write a long report, a separate preview window expands next to the chat box, allowing you to see the rendered result directly or copy the entire clean content with one click. This feature solves several annoying problems of traditional web chat interfaces:
- If a code block is wrapped in other Markdown blocks, it gets truncated once the nesting exceeds the limit, and the subsequent content spills outside the block.
- Selecting and copying text with a mouse often ruins the formatting upon pasting.
- Clicking the "Copy" button on a message often includes the AI's conversational filler, which then has to be manually cleaned up.
Google Ecosystem
Google possesses massive resources and cross-platform integration capabilities, so its feature set is extremely complex. Early market attention was relatively low, but after GPT-5's performance failed to meet expectations late last year, Google took the opportunity to launch Gemini 3.0 Pro, combined with mobile bundles, half-price offers, and all-in-one packages, successfully attracting a large number of users to switch their subscriptions.
Although all AI providers suffer from the issue of outdated training data, because Google's product line is so complex, there are often situations where "the left hand doesn't know what the right hand is doing": sometimes even Gemini itself doesn't know which services are available. For example, when asked about Antigravity-related questions, it had no idea the tool existed; NotebookLM supported Chinese Podcasts by mid-last year, yet when asked in January this year, it replied that it was not yet supported. This situation is particularly noticeable within the Google ecosystem.
1. Developer System (Google AI)
Services are primarily provided through Google AI Studio.
- Provides developers with a free API Key, but with RPM (requests per minute) and other quota limits.
- Can be linked to a credit card to switch to Pay-as-you-go, paying only for what you use.
WARNING
Note: By default, data from the free API Key is used by Google to train base models. This is only automatically disabled after upgrading to the paid version; if you care about privacy, be sure to check this.
2. Enterprise System (Google Cloud / Vertex AI)
Vertex AI hosted on GCP (Google Cloud Platform).
- Designed specifically for enterprise users, providing a complete MLOps toolchain.
- No free quota; billing is based entirely on GCP resources and API usage.
3. Consumer System (Google One)
Google One was originally just for "cloud storage subscriptions," but later, to promote AI, it added subscription plans that include Gemini services.
- About Subscription Plans: The earliest subscription plan including Gemini Advanced was Google One AI Premium. Later, it was split into Plus (added this year) / Pro / Ultra tiers.
- What is Gemini Advanced?
- Many people confuse Google One subscriptions with Gemini Advanced. Simply put, Google One is your "paid subscription plan name," while Gemini Advanced is the "advanced web interface and service" unlocked after payment.
- Only in Gemini Advanced mode can you call Google's higher-end models (such as Gemini 3.1 Pro), enjoy more relaxed conversation limits, and longer context memory.
- Family Sharing and "AI Points":
- Google One can be shared with a family group of 6 people (including yourself).
- Most software benefits (such as each person's Gemini conversation quota) are account-independent. Therefore, some people open 6 accounts to form a family group, and when the main account's quota is exhausted, they switch to other accounts to continue using it.
- However, note: "Cloud Storage" and "AI Points" are shared by the whole family! It is also impossible to limit the usage of individual members separately.
- AI Points are usually consumed when performing compute-intensive generation tasks, such as using Imagen 4 to produce high-quality images or generating videos.
TIP
Now AI Points can also be used for Antigravity model call quotas, though I am quite skeptical about how many calls the 1000 points per month included in the Pro tier can actually provide.
Overview of Common Google Tools
Google has stuffed Gemini into so many products that I have summarized some common tools and usage insights here:
Daily Conversation and Development Assistance
- Gemini App (including Gemini Advanced)
- The most common entry point, available on web and mobile. It is a general-purpose feature carrier that can do a little bit of everything and is very convenient to use; however, for specific tasks requiring precision, it is recommended to use dedicated tools.
- Gemini Code Assist & Gemini CLI
- Gemini Code Assist: An IDE extension. Honestly, it feels a bit awkward; in VS Code, most people are still accustomed to the GitHub Copilot ecosystem. Perhaps its target audience uses other IDEs.
- Gemini CLI: A command-line interface that can be configured to connect to Gemini App, Google AI Studio, or Vertex AI accounts.
- Positioning: If you are using an agent editor like Antigravity, you might find the official CLI or Assist less useful. However, for simple, repetitive, high-volume tasks, you can still offload them to Gemini CLI to save your primary quota.
Office and Knowledge Management
- NotebookLM: Mainly used in two scenarios:
- Document Summarization and Presentation Generation
- A tool very suitable for administrators or PMs. You can drop meeting minutes or long documents in as "sources" and let it help organize key points and outlines.
- The latest version allows you to use commands to ask it to edit PPTs and even download .pptx files (previously it could only export PDF-formatted PPTs).
- Disadvantages and Limitations:
- Generating a PPT supports a maximum of 15 slides at a time.
- Method to bypass: First, ask the AI to write a complete "PPT outline and slide content" for the entire document. Then, create a new "source" for each 15-slide segment of the transcript. Select only one local source at a time and ask it to generate the PPT. Finally, manually merge these .pptx files in PowerPoint. As long as you keep the prompts (such as layout and tone) consistent for each source, you can indeed produce a presentation longer than 15 slides with a similar style.
- The generated presentation will have a NotebookLM watermark in the bottom right corner (usually forced on non-Ultra subscribers).
- Removal method: The watermark on exported .pptx files is often "baked" into the image (flattened with the background), making it impossible to select and delete independently in PowerPoint. Currently, there are two common solutions in the community: one is to drop the file into Canva and use the built-in "Magic Eraser" feature (requires Canva Pro) to wipe out the watermark (or cover it with shapes); the other is to use online AI background/watermark removal tools like NotebookLM Watermark Remover.
- The text inside the generated presentation becomes non-editable images.
- Solution: A free tool commonly used by the community is DeckEdit. It can restore presentations generated by NotebookLM (including layouts converted to images and key illustrations generated by Nano Banana Pro) into PPTX / PDF formats with "directly editable text."
- Personal Knowledge Base
- Disadvantage: Source management is not very intuitive. Although it is very suitable as a long-term knowledge base, due to inconvenient management, it is mostly used as a one-off document summarization tool.
- When source content changes: If it is a local file, you must delete the old one and re-upload; if it is a Google Drive file, although it supports linking, it does not automatically show updates in the list, and you must manually click the source to trigger a sync.
- Podcast Audio Generation (Audio Overview): Can turn documents into a radio show with a man and a woman chatting. I find the practicality subjective, as the information density is low, leaning more towards demonstrating the vividness of the voice.
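The 15-slide workaround described above boils down to cutting the full outline into consecutive chunks, one per NotebookLM source. A minimal sketch of that split, where the function name, the `source-N` labels, and the 40-slide outline are all hypothetical:

```python
def split_into_decks(slides, max_slides=15):
    """Split an ordered list of slide outlines into chunks of at most max_slides."""
    if max_slides < 1:
        raise ValueError("max_slides must be at least 1")
    return [slides[i:i + max_slides] for i in range(0, len(slides), max_slides)]


# Hypothetical 40-slide outline written by the AI in the first step.
outline = [f"Slide {n}" for n in range(1, 41)]

for index, deck in enumerate(split_into_decks(outline), start=1):
    # Each chunk would be saved as its own source and generated separately.
    print(f"source-{index}: {deck[0]} .. {deck[-1]} ({len(deck)} slides)")
```

Each chunk then becomes its own source with the same layout and tone prompt, and the resulting .pptx files are merged by hand.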
Image Generation and Online Agents
- Google Labs (AI Test Kitchen)
- This is where Google usually places its latest experimental tools. It recently underwent a major overhaul: previously, ImageFX (image generation), Flow (video generation), and MusicFX (music) were separate interfaces. Now, the official version has integrated core functions like image generation and editing into New Flow, allowing users to complete the entire "image generation -> editing -> video generation" workflow in one interface.
- I want to play with it, but I haven't found a scenario to use it yet; such experimental audiovisual features are usually the main consumers of "AI Points."
- Jules (Online Coding Agent)
- An AI agent running in the background. You give it tasks or Issues, and it will open branches, write code, and even issue PRs for you to Code Review.
- Disadvantages:
- It uses Gemini under the hood, which is not very suitable for writing complex projects.
- If you care about Git Commit Message formats, it seems it can only "append" messages. If you are not satisfied with the generated message, you have to manually rewrite it when merging the PR using Squash and Merge.
- Because it is non-interactive, requirements must be written very precisely. For example, when I tested it before, the instructions were too vague, and I couldn't correct it through conversation, resulting in a messy project.
- However, if the project is not complex, it's a good choice to find some simple tasks for it to practice on, while also practicing your own ability to write requirement specifications and perform reviews.
Others
- Almost all Google Workspace services (Docs, Gmail, Drive, etc.) have already integrated or are integrating Gemini add-ons.
Google Ecosystem Summary
In terms of free quota usage, Google is the most generous. The main differences between Google's various tiers lie in "usage frequency (or AI points)," "the level of the underlying model that can be called," and "privacy settings."
Although I have recommended Gemini to some non-engineering friends, it wasn't for its coding ability. While it is sufficient for daily conversation, it is clearly inferior to its competitors when encountering complex DevOps or backend architecture tasks. However, the biggest advantage of the Gemini ecosystem lies in its "all-in-one" binding and diverse UI carriers. It seamlessly integrates Workspace office software and provides various interfaces like Gemini App, NotebookLM, ImageFX, and Jules. Most importantly, the quota mechanisms for different platforms are calculated separately. This "use up the quota on one side and switch to the other" approach is what makes it the best value for money right now.
Web Conversation Interface and Model Usage Experience
Regardless of the provider, most people interact most frequently with the Web/App conversation interface, which is also the most intuitive way to feel the "model personality" of each. The following are just personal impressions and may not reflect reality.
Model Conversation Personalities
TIP
The following is based on GPT‑5.2, Claude 4.5, and Gemini 3.1. GPT‑5.3 Instant (released March 4) and GPT‑5.4 (released March 5) have just been released and I have not yet accumulated enough usage experience.
- ChatGPT (Prone to talking to itself): It is recommended to lower the default enthusiasm level via system instructions; otherwise, it often repeats nonsense from multiple angles regarding the same thing. The overall feeling is that of someone talking to themselves, often ignoring the context or definitions provided by the user and replying according to their own understanding. Many people on the internet have reported that it often fails to answer questions directly and requires several back-and-forth exchanges to get on track.
- Claude (Too neutral to have a stance): When discussing technical issues like code architecture, Claude is actually quite organized; but once the topic extends to softer subjects without clear right or wrong, the "lack of stance" is particularly noticeable. In version 4.5, the most typical problem was: as soon as you added a new piece of information, it would immediately change its stance to cater to you, which was annoying every time it happened. If the context provided was insufficient, it would enter "perfunctory mode," constantly summarizing what you said, and only start responding formally once it understood your inclination—in a way, this way of speaking is a bit like me. When it knows what to express, it is very wordy. However, by version 4.6, the wordiness has improved significantly, and the replies are much more concise. As for other issues, I haven't verified them.
- Gemini (Overly eager to perform and too confident): At first, I thought it had more of a stance than Claude, but later I realized it just has a "strong desire to perform," much like someone who wants to be praised (this refers to the tone, not that it actually wants to be praised). The most typical scenario is when fixing bugs; it often confidently declares, "I made these changes, and here is why this will absolutely succeed," but after execution, it is not only still wrong, but sometimes the error is exactly the same.
Mutual Awareness of Models
I really want to complain. Even though these three have updated their models several times recently, their awareness of each other's latest versions is always stuck in the past. The current test results are roughly as follows:
| Question Target | Perceived Latest ChatGPT | Perceived Latest Claude | Perceived Latest Gemini |
|---|---|---|---|
| ChatGPT | 5 | 3.7 | 2 |
| Claude | 4.5 | 4.6 | 2.5 |
| Gemini Pro | 4.5 | 3.5 | 3.1 |
(Note: Interestingly, Gemini Flash's training data is the most recent; additionally, Antigravity is still not in the training data of Gemini models.)
Sometimes when I ask AI to polish my notes, if I don't keep a close eye on it, it might secretly "downgrade" the correct version numbers in my notes back to the old versions it recognizes. This also happens frequently when it modifies Docker image versions in compose.yml.
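One way to catch such silent downgrades is to compare the image tags before and after the AI touches the file. The following is a rough sketch, not a full YAML parser: it assumes images are written on plain `image:` lines, and the service names and version numbers in the example are hypothetical.

```python
import re

# Matches lines such as "image: postgres:17", optionally quoted.
IMAGE_LINE = re.compile(r"^\s*image:\s*['\"]?([^\s'\"#]+)", re.MULTILINE)


def image_tags(compose_text):
    """Map each image repository to the tag written in the compose text."""
    tags = {}
    for ref in IMAGE_LINE.findall(compose_text):
        repo, sep, tag = ref.rpartition(":")
        if not sep or "/" in tag:  # no tag, or the colon belongs to a registry port
            repo, tag = ref, "latest"
        tags[repo] = tag
    return tags


def changed_tags(before, after):
    """Return {repository: (old_tag, new_tag)} for images whose tag changed."""
    old, new = image_tags(before), image_tags(after)
    return {r: (old[r], new[r]) for r in old if r in new and old[r] != new[r]}


before = "services:\n  db:\n    image: postgres:17\n  cache:\n    image: redis:8\n"
after = "services:\n  db:\n    image: postgres:15\n  cache:\n    image: redis:8\n"
print(changed_tags(before, after))  # {'postgres': ('17', '15')}
```

Running a check like this on the diff before committing makes an unrequested version change visible immediately. Digest-pinned images (`@sha256:...`) are not handled by this sketch.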
Supplementary Insights on Gemini Model Usage
The following are several issues that are particularly noticeable in actual use, not limited to the web version or Agent, but the behavior of the Gemini model itself:
Polishing style is hard to tune: Gemini is particularly headache-inducing when it comes to proofreading. Either it changes the content to be overly concise and formulaic (it calls this "professional"); or if you ask it to be more conversational, it is filled with metaphors and exaggerated terms like "divine weapon," "significant improvement," "extremely efficient," "best practice," and "comprehensive analysis." When you tell it that you only need reordering and not to delete information, it becomes afraid to change anything at all. In contrast, the Claude model can adjust the arrangement while retaining the original tone and meaning, and the difference in experience is quite significant.
Response speed and stability: Honestly, I don't feel much improvement in the model; the response speed is actually getting slower. When using Gemini Pro in Antigravity, sometimes it gets stuck and interrupts, and when asked about the current status, it takes several minutes to respond; switching to Claude results in a quick reply, and the contrast is obvious.
Severe anomaly events: From the night of March 3 to the early morning of March 4, there was a severe anomaly: both the Gemini App and Antigravity inexplicably output the thinking process together with the response. Looking closely, the content actually contained other users' personal information or a large amount of content completely unrelated to my questions, as if someone else's content was being output into my conversation. Others in the community have reported similar situations. Related discussions: Threads @freakyketz, Threads @hiphop3535.
Weird logic for providing links: For some reason, Gemini often doesn't provide direct website links, but rather Google search links. I once asked it to polish an article, and it changed all the relative paths of local images into Google search link formats...
Output integrity is unstable: Gemini is good at digesting long background materials, but in terms of the "integrity" of active output, the experience is clearly weaker than other models. Taking Antigravity's Planning mode as an example, it will produce a complete execution plan before execution; but if there is back-and-forth discussion and fine-tuning during the planning process, the subsequently regenerated versions often quietly omit details that were already confirmed, rather than adding or modifying based on the original foundation. When reminded to add them back, if not explicitly specified item by item, it can usually only partially recover them, or even replace the original complete description with a summary version. This tendency is not limited to the carrier; it can also happen in the web version and CLI (see "General Observations on Model Attention" below).
Additionally, note that the number of copies of Antigravity's Implementation Plan is limited (about 20). If there are too many iterations, the correct content of older versions may be permanently lost due to version overwriting. When assigning editing tasks, it is recommended to explicitly limit the scope of modifications to avoid it "optimizing" areas that were not requested, as the original intent is easily lost during repeated iterations.
General Observations on Model Attention
The following applies not only to web versions or Agent editors, but to all AI conversations:
- Large context window ≠ strong attention ability: Many models claim to have a super-large Context Window, but this is different from "being able to maintain attention to the entire conversation." It is like human conversation: we may remember a topic from several rounds ago, yet when it suddenly comes up again in the latest round, we may not immediately recognize what is being referred to.
- Over-focusing leads to getting stuck in a rut: When AI spends a lot of time processing the same problem, the attention weight of that problem in the model becomes very high, and it gets stuck there. For example, I once encountered a situation where a container image had been upgraded and its settings had changed. The correct approach was to directly apply the pre-planned password, but the AI's first move was to switch to development mode instead. I have also encountered situations where the direction was wrong, and despite my reminders and explanations of where its understanding was off, it insisted that I didn't understand and kept drilling down the wrong path.
- Repeated iterations easily omit details from the beginning: In long conversations, when asking the model to modify or regenerate the same document multiple times, you will often find that the output content "gets shorter and shorter," and details that were confirmed earlier quietly disappear. This is a common limitation of long-context models: the longer the conversation and the more conditions that need to be maintained simultaneously, the model's ability to extract information from the beginning will gradually decline, tending to prioritize the most recent instructions, while details from the beginning are easily glossed over. Practical strategy: before regenerating, explicitly repeat the key paragraphs that must be retained, or limit the model to modifying only a specific range to avoid rewriting the entire text.
Invisible Landmines in File Uploads
Successful upload ≠ the model has read the complete content.
There are quite a few traps in web-based file uploads:
- Each provider has hard limits on the number of files per upload. Interestingly, when files are packed into a ZIP, the free ChatGPT lets the archive bypass the file count limit; but the paid Gemini still strictly checks the number of files inside the archive, making the paid experience worse than the free version of others.
- The file parsing mechanism of the Gemini web version is unstable. I once uploaded a Markdown file of about 3500 lines, and during the conversation, I discovered that the model only read the beginning and the end, and the middle had disappeared into thin air. The root cause is that the web interface frontend truncates the attachment during parsing, causing the model to receive incomplete information. Conversely, pasting the same text directly into the chat box allows it to read it completely. In January this year, there was also a situation where the model did not receive the attachment at all in a conversation created from Gem. Gemini said at the time that using Google AI Studio for uploading could avoid this problem, but I haven't verified it (there are many ways to read large amounts of content, such as handing it directly to an Agent editor, without necessarily relying on the web interface).
- Previously, when using ChatGPT, I compressed a project into a ZIP and uploaded it, and the answer was clearly inconsistent with the actual project structure. After questioning, it admitted that it "did not read all the files" and asked me to re-upload the missing parts (the point is that the tone of the reply was "Yeah, I didn't read everything," with a sense of self-righteousness, as if it had done nothing wrong, which is sometimes genuinely infuriating).
Actual Operation of Web Search
In the past year or two, the web conversation interfaces of all providers have come to support web search, but the actual mechanism of "web search" is different from what most people imagine: the model does not directly connect to the URL you paste, but searches through a search engine. Gemini uses Google Search, ChatGPT currently uses its own ChatGPT Search, and I am not sure what Claude uses.
History of Web Search for Each Provider
- Google (Gemini): Started as a search engine and has had the most native Web access capability since the Bard era.
- OpenAI (ChatGPT): Opened "Browse with Bing" to Plus users in May 2023, suspended it in July due to paywall content access, and relaunched it in September. SearchGPT prototype was released in July 2024, and it was officially integrated into ChatGPT at the end of October, allowing users of all plans to obtain real-time search information.
- Anthropic (Claude): Started the latest; native search support arrived later, and it relied more on third-party tool integration in the past.
This mechanism has several practical implications:
- If your website has not been indexed by a search engine, Gemini will not be able to read it. For example, when I moved my notes to GitHub Pages this year, I encountered this problem initially because Google hadn't indexed it yet. If you find that your website cannot be searched, remember to submit sitemap.xml to Google Search Console.
- Websites with robots.txt restrictions, or GitHub itself (possibly due to authorization concerns), are basically unreadable by all models.
- When encountering situations where it cannot read, Gemini makes me feel a bit annoyed. It doesn't want you to know it can't read it, so it starts making up content based on URL information and conversation context; after being questioned, it will apologize and say it was "lazy," then try again, but continue to reply randomly, sometimes with content similar to the previous round.
However, Agent editors (like Antigravity, Copilot) do not have this problem because they usually run directly in your local development environment (or cloud workspace) and have permission to read the workspace file system directly, or can fetch URL content directly through built-in tools.
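To check whether a robots.txt rule is what keeps a crawler away from a page, Python's standard `urllib.robotparser` can evaluate the rules offline. This is a small sketch: the robots.txt content, the `ExampleAIBot` and `OtherBot` user agents, and the `example.com` URLs are all hypothetical, and each provider's crawler uses its own user-agent name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one AI crawler is blocked entirely,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)


print(can_crawl(ROBOTS_TXT, "ExampleAIBot", "https://example.com/notes/"))  # False
print(can_crawl(ROBOTS_TXT, "OtherBot", "https://example.com/notes/"))      # True
```

To test a live site, the same parser can load the real file with `set_url()` and `read()` instead of `parse()`. Note that this only covers robots.txt; a page that is allowed but not yet indexed will still be invisible to search-based browsing.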
Custom AI Assistants
Currently, all providers have custom AI assistant functions: ChatGPT's GPTs, Gemini's Gem, and Claude's Project. All three allow uploading data and setting roles, but they differ in feature completeness and usage threshold:
- GPTs (ChatGPT): The most complete functionality, can connect to external services (Actions), and is the most flexible of the three. However, it is a paid feature; because I am not a heavy user of this, I have never subscribed and haven't tried downloading existing tools from the GPT Store to play with.
- Project (Claude): Also requires a subscription to use. Conversations initiated from Project are managed uniformly by Project, which is convenient for organizing conversations.
- Gem (Gemini): Can be used without a subscription, making it the friendliest of the three for free users. However, the functionality is more basic and does not support category management; at most, it displays which Gem was used for the conversation record when initiating a new chat.
The main purpose of Gem is to let the model simulate a specific perspective or personality to think about problems, and you can also set workflows and output formats in Gem. As for the best setting format, some people on the internet recommend XML, YAML, Markdown, etc. I personally use a mixed format, but I'm not sure if it's the best practice.
Global Personalization
The web conversation interfaces of all providers offer a "Personalization" function, where you can pre-write some guidelines to make the AI's default behavior closer to your habits.
This has a core difference from custom assistants: custom assistants are tailored for specific domains or tasks; personalization is applied as a general default format for all general conversations. When Gem operates independently, it usually masks global settings and is not affected by them.
Regarding the details of Prompt strategies, why too many rules make AI go off-track, and how to combine positive guidance and negative constraints, I have organized them into another article: Prompt Positive Guidance vs. Negative Constraints: Learning from Stepping on Landmines.
AI Agent Editor
I personally mainly use Antigravity and GitHub Copilot as my two Agent editors.
Antigravity vs Copilot vs Gemini CLI
These three tools differ a lot in billing and in how they are best used. Once you understand the differences, dividing work between them is easy:
- Antigravity: Suited to tasks that need back-and-forth discussion. As an agent editor, it can write JavaScript to drive a website while it fixes bugs (but don't expect it to fix CSS layout problems accurately). It used to be very comfortable with the Claude models, since the quota reset every 5 hours, but the quota has felt tight since a weekly limit was added for Claude (if you are in a six-person family group, ignore this). The Gemini models are useful for talking through project bugs to get ideas, and you can switch to a Claude model to write plans.
- GitHub Copilot: Suited to handing off implementation tasks. Billing is per request, with each model carrying a different multiplier (lightweight models consume less, high-end models consume more). Because one request costs the same whether it answers a question or performs a large implementation, as long as you raise the iteration limit, handing the execution plan worked out in Antigravity to Copilot is more efficient than letting Antigravity run complex tasks and burn through its Claude quota.
- Gemini CLI: Its positioning is a bit awkward. It is like a low-end Antigravity that can only use Gemini models, and Gemini is error-prone when coding or handling complex tasks. In OAuth mode it is counted per request like Copilot, but each request has a token limit. I rarely use it except for simple tasks or for tidying up project documents in the background.
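The per-request accounting described for Copilot can be sketched roughly as follows. This is only an illustration: the multiplier values are made up, apart from the weight of 1 that this article gives for Sonnet 4.6 and GPT-5.3 Codex, and the model keys and the `requests_consumed` helper are hypothetical names, not part of any Copilot API.

```python
# Hypothetical multipliers for illustration only. The article states that
# Sonnet 4.6 and GPT-5.3 Codex count as 1; the other two values are assumptions.
MULTIPLIERS = {
    "lightweight-model": 0.25,  # assumption: lightweight models consume less
    "sonnet-4.6": 1.0,          # from the article
    "gpt-5.3-codex": 1.0,       # from the article
    "high-end-model": 3.0,      # assumption: high-end models consume more
}


def requests_consumed(prompts: int, model: str) -> float:
    """Each prompt costs one request multiplied by the model's weight."""
    if prompts < 0:
        raise ValueError("prompts must be non-negative")
    return prompts * MULTIPLIERS[model]


# A long discussion costs one request per prompt, however short each one is.
discussion_cost = requests_consumed(12, "sonnet-4.6")

# Handing over a finished plan is a single prompt, however large the job.
implementation_cost = requests_consumed(1, "gpt-5.3-codex")

print(f"discussion: {discussion_cost}, implementation: {implementation_cost}")
```

Under these assumed numbers, twelve rounds of discussion cost twelve times as much as one large implementation request, which is why the discussion is better held somewhere that is not billed per request.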
Gemini Model Script Batch Processing Problem
Gemini models tend to reach for a script to batch-process a task, which raises a problem: when assigning the task, we still have to judge for ourselves whether the requirement is actually suitable for a script. You do not need to work out the implementation details, but you do need enough judgment to tell whether the basic logic is sound. If you are not sure yourself and let Gemini run a script anyway, it is very likely to break things; even when it swears it will use rigorous regular expressions, do not take its word for it. And once it has settled on a script, asking it to switch to "read each file one by one" meets real resistance in both Antigravity and Gemini CLI, and it often takes several rounds before it gives up the script.
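One way to check whether a regex-based batch edit is safe is to preview it before anything is written. The sketch below is a minimal dry run under my own assumptions: `preview_substitution` is a hypothetical helper, and the sample text and patterns are invented for illustration.

```python
import re


def preview_substitution(text: str, pattern: str, replacement: str) -> list[tuple[int, str, str]]:
    """Return (line_number, before, after) for every line the regex would change."""
    regex = re.compile(pattern)
    changes = []
    for number, line in enumerate(text.splitlines(), start=1):
        new_line = regex.sub(replacement, line)
        if new_line != line:
            changes.append((number, line, new_line))
    return changes


# Invented sample: the goal is to change only the plain "timeout" setting.
sample = "timeout = 30\nconnect_timeout = 30\n# timeout is in seconds\n"

# The unanchored pattern also rewrites connect_timeout, which the preview exposes.
for number, before, after in preview_substitution(sample, r"timeout = \d+", "timeout = 60"):
    print(f"line {number}: {before!r} -> {after!r}")
```

If the preview lists lines you did not expect, the requirement is probably not suited to a script as written, and tightening the pattern (here, anchoring it with `^`) or reviewing the files one by one is the safer route.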
WARNING
On 2026-03-12, Google adjusted the structure of its Google AI subscription plans, and Antigravity's quota is now tied to them. The official announcement reads:
We're evolving Google AI plans to give you more control over how you build. Every subscription includes built-in AI credits, which can now be used for Antigravity, giving you a seamless path to scale.
Google AI Pro is the home for the practical builder, hobbyists, students, and developers who live in the IDE and don't necessarily rely on an agent. This plan features generous limits for Gemini Flash, with a baseline quota included to "taste test" our most advanced premium models.
Google AI Ultra serves as the daily driver for those shipping at the highest scale who need consistent, high-volume access to our most complex models.
If you're on Pro but need "extra juice" for a heavy sprint or deeper access to premium models, simply top up your AI credits to customize your plan.
Keep building. Keep shipping.
Simply put, the Pro plan is built around Gemini Flash, with only a limited trial quota for the advanced models (Gemini Pro, Claude); Ultra is for people who need frequent access to the advanced models; and if a Pro subscriber temporarily needs more quota, AI Credits can be purchased separately as a top-up.
Impact on actual use: the quota for the Gemini Pro and Claude models has been cut significantly and now resets weekly. The situation is a bit paradoxical: Gemini CLI, the lower-end tool, actually has a relatively generous Gemini Pro quota, while Antigravity offers a great experience but a very limited weekly quota for advanced models, so its value for money is noticeably weaker.
Since this mechanism may keep changing, I will not continue to track it or update this section.
I have always been a heavy Visual Studio user and only open VS Code for non-.NET projects, so at first I used Copilot directly in Visual Studio. The experience felt clumsy, and it was not until Chinese New Year that I realized the default Copilot chat in Visual Studio is not Agent mode at all. To get Agent functionality, you have to switch modes manually.
Ranked by how complete the Agent functionality is, the three interfaces come out roughly as Copilot CLI > VS Code > Visual Studio. Copilot CLI has the most settings and the most flexibility; VS Code comes second, with a smooth interface and convenient extensions; Visual Studio is officially "deeply optimized for .NET projects," but in my experience its Agent functionality is comparatively limited. The .NET integration feels less like a bonus and more like a trade-off. Recent VS Code releases have kept picking up CLI features, so the gap is slowly narrowing (see the VS Code February 2026 release, version 1.110).
TIP
That said, when I am modifying a .NET project myself, I still use Visual Studio. It comes down to whether the task is developer-led or Agent-led: for the former I use Visual Studio, and for the latter I use Copilot CLI or VS Code.
WARNING
When using Copilot in a WSL environment (such as Ubuntu), prefer VS Code. Copilot CLI currently relies on PowerShell whenever it needs to execute scripts, and WSL distributions usually do not have it installed by default. When an operation requires PowerShell, the Agent does not fall back to bash automatically; instead, it asks you to run the command manually and report the result back, which seriously disrupts the continuity of the automated workflow.
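Before starting a long Agent session in WSL, it may be worth checking whether a PowerShell executable is on the PATH at all. The sketch below is a minimal check under assumptions of mine: that the CLI would look for `pwsh` (PowerShell 7+) or `powershell`, and the `find_powershell` helper is a hypothetical name. It uses only the standard `shutil.which`.

```python
import shutil
from typing import Callable, Optional

# Assumption: one of these executable names is what gets looked up on PATH.
# "pwsh" is PowerShell 7+, "powershell" is the legacy Windows PowerShell name.
POWERSHELL_NAMES = ("pwsh", "powershell")


def find_powershell(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first PowerShell executable found, or None."""
    for name in POWERSHELL_NAMES:
        path = which(name)
        if path:
            return path
    return None


found = find_powershell()
if found:
    print(f"PowerShell found at {found}")
else:
    print("No PowerShell on PATH; expect manual command hand-offs, or use VS Code.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default is enough.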
Agent Context Management
When using an Agent, it is very important to create appropriate rule files and context documents. A few observations from real use:
- Without a rule file, the AI steps on the same landmine every time: For some temporary tasks, I am too lazy to create files like AGENTS.md, and as a result the Agent hits the same problem on every run, wasting a lot of time before it actually gets to the task.
- Cross-workspace context awareness (no longer valid?): I remember that in Antigravity, if I had windows open for two different workspaces A and B, the conversation in window A could see and read the content of the files currently open in window B. Recently this cross-workspace reading seems to be gone, and the context of each project window is completely isolated. I suspect the vendor removed this global collection mechanism for privacy and security reasons (fear of accidentally leaking confidential code from another project) or because of token cost.
- Conversations that grow too long freeze: Antigravity freezes when a conversation gets too long, forcing you to start a new one, and by then it is often too late to summarize the conversation and carry it over to the new one.
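For the first observation above, one low-effort habit is to scaffold a starter rule file whenever a workspace lacks one. This is a minimal sketch: the section names in the template are my own placeholders rather than any required AGENTS.md schema, and `ensure_rule_file` is a hypothetical helper.

```python
from pathlib import Path

# Hypothetical starter template; the sections are placeholders to fill in,
# not a format required by any particular Agent.
STARTER = """# AGENTS.md

## Environment
- (shell, OS, and runtime versions the agent should assume)

## Known pitfalls
- (mistakes the agent has repeated before, one per line)

## Out of scope
- (files or directories the agent must not touch)
"""


def ensure_rule_file(workspace: Path, name: str = "AGENTS.md") -> bool:
    """Create a starter rule file if none exists. Return True if one was created."""
    target = workspace / name
    if target.exists():
        return False  # never overwrite rules that were written by hand
    target.write_text(STARTER, encoding="utf-8")
    return True
```

Calling `ensure_rule_file(Path.cwd())` at the start of a temporary task leaves an empty "Known pitfalls" list ready, so a repeated mistake can be recorded the first time it happens.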
Personal Trade-offs and Postscript
Gemini 3 Pro was actually quite good last December, and it correctly answered many questions that every provider's models had gotten wrong in the first half of last year. But the more I use it now, the worse it feels. I can't tell whether it has really regressed, or whether all models are improving and my own standards have risen with them (I have been getting angry while chatting with Gemini more and more often lately). For now I have only one expectation: I hope the next version can say "I don't know" when it doesn't know, instead of always pretending to know everything.
In terms of model selection, I have always believed that Claude is the best model for coding, and I was tempted to re-subscribe after Claude was upgraded to 4.6. But on reflection, my original reason for subscribing was to have one service for coding and one for everything else. Copilot already handles the coding, and Claude is not particularly outstanding in non-engineering scenarios, so for now I will just use the free quota for occasional discussions and do not plan to re-subscribe.
The high-end Claude Opus 4.6 also costs more, while Sonnet 4.6 and GPT-5.3 Codex both have a weight of 1 in Copilot. Some in the community believe GPT-5.3 Codex actually performs better as a coding Agent. For example, Po-Chieh suggested using Claude Opus 4.6 for writing documents and making plans, GPT-5.3 Codex (the latest is 5.4) for coding and development, and Gemini 3.1 Pro for building web pages and design.
Still, I have found that I have become a heavy AI user. When I want to build something new, my current development process looks roughly like this:
- Hand the collected information, ideas, and preliminary discussion notes to Antigravity to generate a prototype plan.
- Research the topics the plan raises, build my own mental model, and then iterate on the whole plan with the AI.
- Finally, hand the actual development work over to Copilot.
I have officially joined the ranks of PMs who don't write code and only talk.
Change Log
- 2026-03-07 Initial document creation.
- 2026-03-13
  - Added explanation regarding Google's adjustment of the Antigravity quota mechanism on 2026-03-12.
  - Added issues and suggestions regarding Gemini's unstable output integrity.
  - Added limitations of Copilot CLI in WSL environments and rule file reading issues.